Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models
Internal identifier: 001565 (Main/Exploration); previous: 001564; next: 001566
Authors: Takashi Okada [Japan]; Atsuhiro Takasu [Japan]; Jun Adachi [Japan]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2004.
Abstract
Article citations are composed of subfields such as author, title, journal, and year. It is useful to automatically identify attributes of these subfields, since they are used for linking a citation with the actual cited article. In this article, we employ a Support Vector Machine (SVM), a method of machine learning, to automatically identify subfields. We then employ a Hidden Markov Model (HMM) to improve the identification accuracy. Information from the subfields identified by the SVM, and syntactic information analyzed by the HMM, are integrated to make an accurate identification.
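The abstract gives no implementation details, so the following is only a rough sketch of the general idea it describes: per-token classifier scores are combined with HMM-style transition probabilities through Viterbi decoding. It is not the authors' method. The label set, the tokens, and every probability below are invented for illustration, and the hard-coded `svm_scores` merely stand in for the output of a trained classifier.

```python
import math

# Hypothetical label set; the paper's actual subfield inventory is not given here.
LABELS = ["author", "title", "year"]


def _log(p):
    """Logarithm that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(scores, transition, initial):
    """Return the most likely label sequence.

    scores:     one dict per token, label -> classifier score in (0, 1]
    transition: dict (previous_label, current_label) -> probability
    initial:    dict label -> probability of starting in that label
    """
    # best[t][label] = best log-score of any path ending in `label` at token t
    best = [{lab: _log(initial[lab]) + _log(scores[0][lab]) for lab in LABELS}]
    back = []  # back[t-1][label] = best predecessor of `label` at token t
    for tok_scores in scores[1:]:
        col, ptr = {}, {}
        for cur in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + _log(transition[(p, cur)]))
            col[cur] = best[-1][prev] + _log(transition[(prev, cur)]) + _log(tok_scores[cur])
            ptr[cur] = prev
        best.append(col)
        back.append(ptr)
    # Follow back-pointers from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Invented example: four tokens of a citation and made-up classifier scores.
tokens = ["Okada", "Component", "Extraction", "2004"]
svm_scores = [
    {"author": 0.80, "title": 0.15, "year": 0.05},
    {"author": 0.20, "title": 0.70, "year": 0.10},
    {"author": 0.45, "title": 0.40, "year": 0.15},  # classifier alone prefers "author"
    {"author": 0.05, "title": 0.05, "year": 0.90},
]
# Made-up transition model that favours the order author -> title -> year.
transition = {
    ("author", "author"): 0.50, ("author", "title"): 0.45, ("author", "year"): 0.05,
    ("title", "author"): 0.05, ("title", "title"): 0.60, ("title", "year"): 0.35,
    ("year", "author"): 0.10, ("year", "title"): 0.10, ("year", "year"): 0.80,
}
initial = {"author": 0.80, "title": 0.15, "year": 0.05}

# Labelling each token independently versus decoding the whole sequence.
greedy = [max(LABELS, key=s.get) for s in svm_scores]
decoded = viterbi(svm_scores, transition, initial)
print(list(zip(tokens, greedy)))
print(list(zip(tokens, decoded)))
```

With these made-up numbers, labelling each token independently assigns "author" to the third token, while sequence decoding assigns it "title" because a return from title to author is unlikely under the transition model. This is one way sequence information can correct an isolated classifier error.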
DOI: 10.1007/978-3-540-30230-8_46
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000639
- to stream Istex, to step Curation: 000631
- to stream Istex, to step Checkpoint: 000E00
- to stream Main, to step Merge: 001616
- to stream Main, to step Curation: 001565
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models</title>
<author><name sortKey="Okada, Takashi" sort="Okada, Takashi" uniqKey="Okada T" first="Takashi" last="Okada">Takashi Okada</name>
</author>
<author><name sortKey="Takasu, Atsuhiro" sort="Takasu, Atsuhiro" uniqKey="Takasu A" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
</author>
<author><name sortKey="Adachi, Jun" sort="Adachi, Jun" uniqKey="Adachi J" first="Jun" last="Adachi">Jun Adachi</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:DF534A7FD00E3F6BCB2FC98D05CB828CE9813463</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-30230-8_46</idno>
<idno type="url">https://api.istex.fr/document/DF534A7FD00E3F6BCB2FC98D05CB828CE9813463/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000639</idno>
<idno type="wicri:Area/Istex/Curation">000631</idno>
<idno type="wicri:Area/Istex/Checkpoint">000E00</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Okada T:bibliographic:component:extraction</idno>
<idno type="wicri:Area/Main/Merge">001616</idno>
<idno type="wicri:Area/Main/Curation">001565</idno>
<idno type="wicri:Area/Main/Exploration">001565</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models</title>
<author><name sortKey="Okada, Takashi" sort="Okada, Takashi" uniqKey="Okada T" first="Takashi" last="Okada">Takashi Okada</name>
<affiliation wicri:level="3"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Information Science and Technology, Information and Communication Engineering, The University of Tokyo, 7-3-1 Bunkyo-ku, Tokyo</wicri:regionArea>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Japon</country>
</affiliation>
</author>
<author><name sortKey="Takasu, Atsuhiro" sort="Takasu, Atsuhiro" uniqKey="Takasu A" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
<affiliation wicri:level="3"><country xml:lang="fr">Japon</country>
<wicri:regionArea>National Institute of Informatics, 2-1-2, Hitotsubashi, Chiyoda-ku, Tokyo</wicri:regionArea>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Japon</country>
</affiliation>
</author>
<author><name sortKey="Adachi, Jun" sort="Adachi, Jun" uniqKey="Adachi J" first="Jun" last="Adachi">Jun Adachi</name>
<affiliation wicri:level="3"><country xml:lang="fr">Japon</country>
<wicri:regionArea>National Institute of Informatics, 2-1-2, Hitotsubashi, Chiyoda-ku, Tokyo</wicri:regionArea>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Japon</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">DF534A7FD00E3F6BCB2FC98D05CB828CE9813463</idno>
<idno type="DOI">10.1007/978-3-540-30230-8_46</idno>
<idno type="ChapterID">46</idno>
<idno type="ChapterID">Chap46</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Article citations are composed of subfields such as author, title, journal, and year. It is useful to automatically identify attributes of these subfields, since they are used for linking a citation with the actual cited article. In this article, we employ a Support Vector Machine (SVM), a method of machine learning, to automatically identify subfields. We then employ a Hidden Markov Model (HMM) to improve the identification accuracy. Information from the subfields identified by the SVM, and syntactic information analyzed by the HMM, are integrated to make an accurate identification.</div>
</front>
</TEI>
<affiliations><list><country><li>Japon</li>
</country>
<settlement><li>Tokyo</li>
</settlement>
</list>
<tree><country name="Japon"><noRegion><name sortKey="Okada, Takashi" sort="Okada, Takashi" uniqKey="Okada T" first="Takashi" last="Okada">Takashi Okada</name>
</noRegion>
<name sortKey="Adachi, Jun" sort="Adachi, Jun" uniqKey="Adachi J" first="Jun" last="Adachi">Jun Adachi</name>
<name sortKey="Adachi, Jun" sort="Adachi, Jun" uniqKey="Adachi J" first="Jun" last="Adachi">Jun Adachi</name>
<name sortKey="Okada, Takashi" sort="Okada, Takashi" uniqKey="Okada T" first="Takashi" last="Okada">Takashi Okada</name>
<name sortKey="Takasu, Atsuhiro" sort="Takasu, Atsuhiro" uniqKey="Takasu A" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
<name sortKey="Takasu, Atsuhiro" sort="Takasu, Atsuhiro" uniqKey="Takasu A" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001565 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001565 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:DF534A7FD00E3F6BCB2FC98D05CB828CE9813463 |texte= Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models }}
This area was generated with Dilib version V0.6.32.